Text Generation for Abstractive Summarization

Authors

  • Pierre-Etienne Genest
  • Guy Lapalme
Abstract

We have begun work on a framework for abstractive summarization and decided to focus on a module for text generation. For TAC 2010, we thus move away from sentence extraction. Each sentence in the summary we generate is based on a document sentence, but it usually contains less information and uses fewer words. The system takes the output of a syntactic parser for a sentence and regenerates part of that sentence using a Natural Language Generation engine. Summary sentences are selected from the regenerated sentences based on the document frequency of the words they contain, while avoiding redundancy. Dates and locations were handled and generated specifically for cluster categories 1 and 2. Even though our initial scores were not outstanding, we intend to continue work on this approach in the coming years.
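The selection step described in the abstract (ranking regenerated sentences by the document frequency of their words while avoiding redundancy) can be illustrated with a minimal sketch. The abstract does not give the actual scoring formula or redundancy test, so everything here is an assumption made for illustration: the function names, the whitespace tokenizer, the mean-document-frequency score, and the word-overlap threshold `max_overlap` are all hypothetical and are not the authors' implementation.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation (assumed tokenizer)."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w]


def document_frequency(documents):
    """Count, for each word, how many documents contain it."""
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return df


def select_sentences(candidates, documents, max_sentences=2, max_overlap=0.5):
    """Greedily pick high-scoring candidates, skipping redundant ones.

    Score (assumption): mean document frequency of a sentence's distinct words.
    Redundancy (assumption): share of a candidate's words already present in
    some selected sentence exceeds max_overlap.
    """
    df = document_frequency(documents)

    def score(sentence):
        words = set(tokenize(sentence))
        return sum(df[w] for w in words) / len(words) if words else 0.0

    selected = []
    for sentence in sorted(candidates, key=score, reverse=True):
        words = set(tokenize(sentence))
        if not words:
            continue
        redundant = any(
            len(words & set(tokenize(s))) / len(words) > max_overlap
            for s in selected
        )
        if not redundant:
            selected.append(sentence)
        if len(selected) == max_sentences:
            break
    return selected
```

Called with a list of regenerated candidate sentences and the source documents of a cluster, `select_sentences` returns up to `max_sentences` sentences in score order. A greedy pass is used only because it is the simplest way to combine a frequency score with a redundancy check; the system described in the abstract may well combine them differently.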

Similar resources

Abstractive Document Summarization with a Graph-Based Attentional Neural Model

Abstractive summarization is the ultimate goal of document summarization research, but it has previously been less investigated due to the immaturity of text generation techniques. Recently, impressive progress has been made in abstractive sentence summarization using neural models. Unfortunately, attempts on abstractive document summarization are still in a primitive stage, and the evaluation results...

Extractive vs. NLG-based Abstractive Summarization of Evaluative Text: The Effect of Corpus Controversiality

Extractive summarization is the strategy of concatenating extracts taken from a corpus into a summary, while abstractive summarization involves paraphrasing the corpus using novel sentences. We define a novel measure of corpus controversiality of opinions contained in evaluative text, and report the results of a user study comparing extractive and NLG-based abstractive summarization at differen...

Framework for Abstractive Summarization using Text-to-Text Generation

We propose a new, ambitious framework for abstractive summarization, which aims at selecting the content of a summary not from sentences, but from an abstract representation of the source documents. This abstract representation relies on the concept of Information Items (INIT), which we define as the smallest element of coherent information in a text or a sentence. Our framework differs from pr...

Multilingual Natural Language Generation within Abstractive Summarization

With the tremendous amount of textual data available on the Internet, techniques for abstractive text summarization become increasingly appreciated. In this paper, we present work in progress that tackles the problem of multilingual text summarization using semantic representations. Our system is based on abstract linguistic structures obtained from an analysis pipeline of disambiguation, synta...

Génération de résumés par abstraction complète

This Ph.D. thesis is the result of several years of research on automatic text summarization. Three major contributions are presented in the form of published and yet to be published papers. They follow a path that moves away from extractive summarization and toward abstractive summarization. The first article describes the HexTac experiment, which was conducted to evaluate the performance of h...

Faithful to the Original: Fact Aware Neural Abstractive Summarization

Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. Our preliminary study reveals nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on the improvement of informativeness, we argue that ...

Publication date: 2010